At Bloomberg, we deliver billions of data points to hundreds of thousands of customers every day (and growing), enabling them to make informed business decisions. It is paramount that our customers have reliable access to our services so they receive market-moving data when they need it. As our customers' needs continue to increase and evolve, our mission is to leverage software engineering, collaboration, and automation to handle the demand and exceed expectations. This is where you come in.

Through software engineering and collaboration with other engineering teams, members of the Data License Reliability Engineering team design and build solutions and incorporate industry-standard best practices to ensure Data License services run reliably for our customers. This includes increasing the observability of Data License services, automating capacity management of production infrastructure, and reducing the time it takes to resolve issues through deployment and incident response automation. Our team builds and maintains a full-stack application, the Synthetic Requests and Notification system (aka SyReN), that monitors Data License services end-to-end, generating service level metrics and triggering alerts when issues are detected.

Additionally, our team routinely assesses and tests Data License services through automated game days and proactive chaos testing to ensure the highest level of service reliability is delivered to our customers.

We'll trust you to:
- Take a "solve this with automation" approach to challenges and issues related to service reliability
- Improve the observability of applications, services, and infrastructure systems to help teams understand system performance
- Design and propose improvements to software solutions used to measure and monitor the performance of applications and services
- Collaborate with application development teams to identify gaps that negatively impact service reliability, including improving monitoring, capacity management, and incident response workflows
- Manage a high-quality and robust production platform and promotion pipeline to ensure available capacity and resources for services
- Reduce human toil through automation of manual tasks, steps, and workflows
- Work collaboratively with the team to accomplish goals within an agile software development lifecycle

You'll need to have:
- 4+ years of experience working with an object-oriented programming language (C/C++, Python, Java, etc.)
- A degree in Computer Science, Engineering, Mathematics, or a similar field of study, or equivalent work experience
- A preference for a data-driven approach to decision making
- Creative problem-solving approaches that account for existing services, environment, and resource constraints
- Demonstrated understanding of, or experience working in, all levels of the technical stack, from applications to underlying computing infrastructure and machine hardware
- Willingness to learn new technologies and adapt to changing priorities

We'd love to see:
- Containerization technologies (Docker, Kubernetes, Mesos)
- Chaos testing or similar experience to validate reliability
- Infrastructure as code and configuration management tools
- Defining and measuring service level indicators and service level objectives for applications, services, and infrastructure